Efficient generalized tensor product codes encoding schemes

ABSTRACT

A method for generating a binary GTP codeword, comprised of N structure stages where each stage comprises at least one BCH codeword with error correction capability greater than a prior stage and smaller than a next stage, includes: receiving a syndrome vector s of a new stage 0 binary BCH codeword y over a field GF(2^(m)) that comprises Δt syndromes of length m bits, wherein the syndrome vector s comprises l-th Reed-Solomon (RS) symbols of Δt RS codewords whose information symbols are delta syndromes of all BCH codewords from stage 0 until stage n−1; and multiplying s by a right submatrix Ũ of a matrix U, wherein U is an inverse of a parity matrix of a BCH code defined by t_(n), wherein the new binary BCH codeword is y=Ũ·s.

TECHNICAL FIELD

Embodiments of the present disclosure are directed to methods of forming error correcting codes.

DISCUSSION OF THE RELATED ART

In coding theory, BCH codes form a class of cyclic error-correcting codes that are constructed using finite fields. BCH codes are named for their inventors Raj Bose, D. K. Ray-Chaudhuri, and Alexis Hocquenghem. BCH codes can be defined as follows. Let GF(q) be a finite field, where q is a prime power, and choose positive integers m, n, d, c such that 2≤d≤n, gcd(n, q)=1, and m is the multiplicative order of q modulo n. Further, let α be a primitive nth root of unity in GF(q^(m)), and let m_(i)(x) be the minimal polynomial over GF(q) of α^(i) for all i. The generator polynomial of the BCH code is the least common multiple g(x)=lcm(m_(c)(x), . . . , m_(c+d−2)(x)). BCH codes allow a precise control over the number of symbol errors correctable by the code during code design. In particular, it is possible to design binary BCH codes that can correct multiple bit errors. BCH codes can also be easily decoded using an algebraic method known as syndrome decoding. This can simplify the design of the decoder for these codes by using small, low-power electronic hardware.
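The generator polynomial definition above can be illustrated with a small sketch. It assumes q=2, n=15, c=1, and the field GF(2^4) built from x^4+x+1; these parameters and all function names are chosen here only for illustration and are not taken from the disclosure. Polynomials over GF(2) are stored as Python integers whose bit i is the coefficient of x^i.

```python
def gf2_poly_mul(a: int, b: int) -> int:
    """Carry-less product of two GF(2) polynomials stored as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf16_tables(prim_poly: int = 0b10011):
    """Exp/log tables for GF(2^4); x^4 + x + 1 is an assumed field polynomial."""
    exp = []
    value = 1
    for _ in range(15):
        exp.append(value)
        value <<= 1
        if value & 0b10000:
            value ^= prim_poly
    log = {v: i for i, v in enumerate(exp)}
    return exp, log


EXP, LOG = gf16_tables()


def gf16_mul(a: int, b: int) -> int:
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 15]


def minimal_poly(j: int) -> int:
    """Minimal polynomial over GF(2) of alpha^j, as a bitmask."""
    # Conjugacy class of j under repeated doubling modulo 15.
    cls = []
    c = j % 15
    while c not in cls:
        cls.append(c)
        c = (2 * c) % 15
    # Multiply out prod (x + alpha^c); coefficients are GF(16) elements, low degree first.
    poly = [1]
    for c in cls:
        root = EXP[c]
        new = [0] * (len(poly) + 1)
        for i, coef in enumerate(poly):
            new[i + 1] ^= coef                  # coef * x
            new[i] ^= gf16_mul(coef, root)      # coef * root
        poly = new
    # The product has coefficients in GF(2), i.e. each entry is 0 or 1.
    return sum(coef << i for i, coef in enumerate(poly))


def bch_generator(t: int) -> int:
    """g(x) = lcm(m_1, ..., m_2t): product of the distinct minimal polynomials."""
    g = 1
    for m in {minimal_poly(j) for j in range(1, 2 * t + 1)}:
        g = gf2_poly_mul(g, m)
    return g


print(bin(bch_generator(2)))  # generator of the double-error-correcting code of length 15
```

Because distinct minimal polynomials are irreducible and therefore coprime, the least common multiple reduces to the product of the distinct ones, which is what `bch_generator` computes.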

The syndrome of a received codeword y=x+e, where x is the transmitted codeword and e is an error pattern, is defined in terms of the parity matrix H. The parity check matrix of a linear code C is a generator matrix of the dual code C^(⊥), which means that a codeword c is in C iff the matrix-vector product Hc=0. Then, the syndrome of the received word y=x+e is defined as S=Hy=H(x+e)=Hx+He=0+He=He. Syndromes are useful in that the message polynomial drops out, and the length of S, the assumed number of errors L, is much less than the length of the codeword x itself.
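The identity S=Hy=He can be checked numerically. The sketch below uses the (7,4) Hamming code as a stand-in, since it is small enough to verify by hand; this code, the chosen codeword, and the helper name `matvec_gf2` are illustrative assumptions and do not appear in the disclosure.

```python
def matvec_gf2(H, v):
    """Matrix-vector product over GF(2)."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


# Parity check matrix of the (7,4) Hamming code (column i is i+1 in binary).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

x = [1, 1, 1, 0, 0, 0, 0]             # a codeword: H x = 0
e = [0, 0, 0, 0, 1, 0, 0]             # single-bit error pattern
y = [a ^ b for a, b in zip(x, e)]     # received word y = x + e

syndrome_of_y = matvec_gf2(H, y)
syndrome_of_e = matvec_gf2(H, e)
print(syndrome_of_y, syndrome_of_e)   # both depend only on e
```

The codeword contributes nothing to the syndrome, so a decoder only ever needs to reason about the error pattern.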

An exemplary encoding/decoding is depicted in FIG. 1, in which a weak BCH code is given by C_(w) ⊆ F_(q)^(n), the k_(w) information bits are x ∈ F_(q)^(k_(w)), and the parity check matrix of the weak BCH is H_(w) ∈ F_(q)^((n−k_(w))×n). A stronger BCH C_(s) ⊆ F_(q)^(n) has parity check matrix H_(s) ∈ F_(q)^((n−k_(s))×n) given by:

$\begin{matrix} {{H_{s} = {\begin{pmatrix} H_{w} \\ H_{d} \end{pmatrix} \in F_{q}^{{({n - k_{s}})} \times n}}},} & (1) \end{matrix}$

where the number of rows of matrix H_(d) is (n−k_(s))−(n−k_(w))=k_(w)−k_(s), where the subscripts s, w, d respectively refer to strong, weak, and delta. Trivially, the information data x ∈ F_(q)^(k_(w)) can be encoded and decoded with the weak code C_(w) ⊆ F_(q)^(n), and can be encoded and decoded with the strong code C_(s) ⊆ F_(q)^(n), as described herein below.

Information data x ∈ F_(q)^(k_(w)) can be encoded using a BCH code C_(w) to obtain the codeword c_(w), as illustrated in FIG. 2. The overhead of the strong code is given by H_(d)c_(w) as shown in FIG. 2. This additional overhead may reside on a different page. Note that the overhead stored for a strong code according to an embodiment is not a conventional BCH overhead but H_(d)c_(w) instead. However, the overall code is equivalent to a BCH code, as will be shown herein below. This overhead is protected by additional ECC, so that H_(d)c_(w) is clean from errors.

A read of c_(w) with errors yields:

y=c _(w) +e.   (2)

From y, the syndrome of the weak BCH can be computed:

H_(w)y=H_(w)e.   (3)

If the information can be decoded from this syndrome, then decoding is complete; otherwise, compute:

H _(d) y−H _(d) c _(w) =H _(d) c _(w) +H _(d) e−H _(d) c _(w) =H _(d) e,   (4)

since H_(d)c_(w) was saved in a protected form. Thus there exist both H_(w)e and H_(d)e and therefore, from EQ. (1):

$\begin{matrix} {{H_{s}e} = {\begin{pmatrix} H_{w} \\ H_{d} \end{pmatrix}e}} & (5) \end{matrix}$

Thus, decoding can be performed using the stronger BCH C_(s) with parity check matrix H_(s).
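The steps of EQS. (2) to (5) can be traced with a toy example. In the sketch below the weak code uses the first two rows of a (7,4) Hamming parity check matrix and the delta matrix H_(d) is its third row; these matrices, the chosen c_(w), and the helper names are assumptions made only for illustration and are not actual BCH matrices.

```python
def matvec_gf2(H, v):
    """Matrix-vector product over GF(2)."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


def xor_vec(a, b):
    return [p ^ q for p, q in zip(a, b)]


H_w = [[1, 0, 1, 0, 1, 0, 1],
       [0, 1, 1, 0, 0, 1, 1]]          # weak-code checks
H_d = [[0, 0, 0, 1, 1, 1, 1]]          # additional (delta) checks of the strong code

c_w = [0, 0, 0, 1, 0, 0, 0]            # satisfies H_w c_w = 0, but H_d c_w != 0
overhead = matvec_gf2(H_d, c_w)        # H_d c_w, stored separately and assumed error-free

e = [0, 0, 0, 0, 1, 0, 0]
y = xor_vec(c_w, e)                    # EQ. (2): noisy read

weak_syndrome = matvec_gf2(H_w, y)                       # EQ. (3): equals H_w e
delta_syndrome = xor_vec(matvec_gf2(H_d, y), overhead)   # EQ. (4): equals H_d e
strong_syndrome = weak_syndrome + delta_syndrome         # EQ. (5): stacked H_s e
print(strong_syndrome)
```

Over GF(2) subtraction and addition coincide, which is why EQ. (4) is implemented with an XOR against the stored overhead.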

Suppose that a weak BCH has error capability t_(w) and a strong BCH has error capability t_(s). Then the weak code uses the syndromes H_(w)e, and the strong BCH requires the syndromes of the weak BCH H_(w)e, as well as additional syndromes H_(d)e. The Berlekamp-Massey (BM) decoder works sequentially on the syndromes, that is, one syndrome after the other. Therefore, if a weak BCH fails, a decoding procedure of the strong BCH can resume from the same point. It will not be necessary to start decoding the strong BCH from the beginning, and it is therefore more efficient. In contrast, other decoding algorithms, such as Euclidean decoding, work simultaneously on all syndromes. Therefore, decoding of weak BCH and strong BCH codes is performed separately, and consequently will be more time consuming.

Another concern in constructing error correcting codes is how to make them simply decodable and efficient. Bose-Chaudhuri-Hocquenghem (BCH) codes are efficient but not simply decodable. On the other hand, majority-logic decodable codes are simply decodable but generally not efficient. A more general class of error-correcting codes, known as Generalized Tensor Product (GTP) codes, which can be formed by combining a number of codes on various fields with shorter binary codes, has a decoder for bounded distance decoding that is much simpler than the decoders for other codes of comparable or superior quality. GTP codes were introduced by Imai, et al., “Generalized Tensor Product Codes”, IEEE Transactions on Information Theory, Vol. IT-27, No. 2, March 1981, pgs. 181-187, the contents of which are herein incorporated by reference in their entirety, and the relevant aspects are summarized as follows.

To construct a GTP code, first define the matrices H_(i), i=1,2, . . . , μ, and Ĥ_(i), i=1,2, . . . , μ. H_(i) is an r_(i)×n matrix over GF(2) such that the (r₁+r₂+ . . . +r_(i))×n matrix

$M_{i} = \begin{bmatrix} H_{1} \\ H_{2} \\ \vdots \\ H_{i} \end{bmatrix}$

is a check matrix of a binary (n, n−r₁−r₂− . . . −r_(i)) code of minimum distance d_(i). Ĥ_(i) is a λ_(i)×v matrix over GF(2^(r_(i))) that is a check matrix of a (v, v−λ_(i)) code over GF(2^(r_(i))) of minimum distance δ_(i).

Let ϕ_(kl)^((i)) ∈ GF(2^(r_(i))) be the (k, l)th element of Ĥ_(i), k=1, 2, . . . , λ_(i), l=1, 2, . . . , v. If the matrix representing multiplication by ϕ_(kl)^((i)) according to some fixed basis of GF(2^(r_(i))) over GF(2) is denoted by Φ_(kl)^((i)), then the tensor product of the matrices Ĥ_(i) and H_(i) is given as

$\mathcal{H}_{i} = \hat{H}_{i} \otimes H_{i} = \begin{bmatrix} \Phi_{11}^{(i)}H_{i} & \Phi_{12}^{(i)}H_{i} & \ldots & \Phi_{1v}^{(i)}H_{i} \\ \Phi_{21}^{(i)}H_{i} & \Phi_{22}^{(i)}H_{i} & \ldots & \Phi_{2v}^{(i)}H_{i} \\ \vdots & \vdots & & \vdots \\ \Phi_{\lambda_{i}1}^{(i)}H_{i} & \Phi_{\lambda_{i}2}^{(i)}H_{i} & \ldots & \Phi_{\lambda_{i}v}^{(i)}H_{i} \end{bmatrix}, \quad i = 1, 2, \ldots, \mu.$

In this expression Φ_(kl)^((i))H_(i) are r_(i)×n matrices of elements from GF(2) and therefore ℋ_(i) is an r_(i)λ_(i)×nv matrix of elements from GF(2).

A GTP code is defined as a linear code having a check matrix of the form

$\mathcal{H} = \begin{bmatrix} \mathcal{H}_{1} \\ \mathcal{H}_{2} \\ \vdots \\ \mathcal{H}_{\mu} \end{bmatrix}.$

The code length N and the number of check bits N−K of a GTP code are given by

N=nv,

N−K=Σ_(i=1) ^(μ)r_(i)λ_(i). It can be shown that the GTP representation in the background and a GTP representation according to embodiments of the disclosure as described below are mathematically equal and lead to the same code structure.

SUMMARY

An encoding method according to embodiments of the disclosure uses multiplications by smaller matrices followed by multi-clock, partially parallel multiplications instead of a single-clock, fully parallel multiplication. Further, a method according to an embodiment separates a single binary matrix multiplication into several smaller matrix multiplications by using polynomial representations and by utilizing BCH code properties.

According to an embodiment of the disclosure, there is provided a computer implemented method for generating a binary Generalized Tensor Product (GTP) codeword, comprised of N structure stages wherein N is an integer greater than 1 and each stage is comprised of at least one BCH codeword with error correction capability greater than a prior stage and smaller than a next stage, the method including: receiving a syndrome vector s of a new stage 0 binary BCH codeword y over a field GF(2^(m)) that comprises Δt syndromes of length m bits, wherein Δt=t_(n)−t₀, t₀ is the error correction capability of the stage 0 BCH codeword, t_(n) is an error correction capability of a stage n BCH codeword to which y will be added, wherein the syndrome vector s comprises l-th Reed-Solomon (RS) symbols of Δt RS codewords whose information symbols are delta syndromes of all BCH codewords from stage 0 until stage n−1, wherein l indexes the BCH codeword to which a new binary BCH codeword will be added; and multiplying s by a right submatrix Ũ of a matrix U, wherein U is an inverse of a parity matrix of a BCH code defined by t_(n), wherein the submatrix Ũ is of size mt₀×mΔt, wherein the new binary BCH codeword is y=Ũ·s.

According to a further embodiment of the disclosure, multiplying s by right submatrix Ũ of matrix U comprises multiplying each component of the syndrome vector s by a component of submatrix Ũ by a binary logic function in a single hardware cycle to yield a component product, wherein submatrix Ũ is calculated before receiving syndrome vector s of new binary BCH codeword y, and multiplexing the component products into a single output that represents the new binary BCH codeword y.

According to a further embodiment of the disclosure, the syndrome vector s is demultiplexed into separate Ũ matrices.

According to a further embodiment of the disclosure, multiplying s by right submatrix Ũ of matrix U further includes: multiplying each component of the syndrome vector s by a component of reduced submatrix Ũ′ by a binary logic function in a single hardware cycle to yield a component product, wherein reduced submatrix defined by Ũ′(x)=Ũ (x)/g₀ (x), wherein columns of submatrices Ũ′ and Ũ are represented as polynomials and each column of Ũ′ (x) is the column of Ũ (x) divided by g₀ (x), are calculated before receiving syndrome vector s of new binary BCH codeword y; multiplexing the component products into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.

According to a further embodiment of the disclosure, the syndrome vector s is demultiplexed into separate Ũ′ matrices.

According to a further embodiment of the disclosure, convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.

According to a further embodiment of the disclosure, multiplying s by a right submatrix Ũ of a matrix U further includes: calculating a polynomial h_(j)(x) by multiplying syndrome vector s by a matrix H_(j) formed by concatenating the m polynomials h_(j,l) as columns, wherein polynomials

${h_{j,l}(x)} = {{{{{\overset{\_}{U}}_{{mj} + l}^{\prime}(x)}/{M_{j}(x)}}\mspace{14mu} {and}\mspace{14mu} {M_{j}(x)}} = {\prod\limits_{\underset{i \neq {j + t_{0} + 1}}{i = {t_{0} + 1}}}^{t_{1}}\; {m_{i}(x)}}}$

wherein m_(i)(x) is an i-th minimal polynomial of the BCH code C₁ with correction capability of t₁ and is calculated before receiving syndrome vector s of new binary BCH codeword y; multiplying h_(j)(x) by M_(j)(x), and summing over j=0 to Δt−1; multiplexing the sums of the products h_(j)(x)×M_(j)(x) into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.

According to a further embodiment of the disclosure, the syndrome vector s is demultiplexed into separate sets of H_(j) and M_(j).

According to a further embodiment of the disclosure, convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.

According to another embodiment of the disclosure, there is provided a computer processor configured to execute a program of instructions to perform the method steps for generating a binary Generalized Tensor Product (GTP) codeword, comprised of N structure stages wherein N is an integer greater than 1 and each stage is comprised of at least one BCH codeword with error correction capability greater than a prior stage and smaller than a next stage, the method including: receiving a syndrome vector s of a new stage 0 binary BCH codeword y over a field GF(2^(m)) that comprises Δt syndromes of length m bits, wherein Δt=t_(n)−t₀, t₀ is the error correction capability of a stage 0 BCH codeword, t_(n) is an error correction capability of a stage n BCH codeword to which y will be added, wherein the syndrome vector s comprises the l-th Reed-Solomon (RS) symbols of Δt RS codewords whose information symbols are delta syndromes of all BCH codewords from stage 0 until stage n−1, wherein l indexes the BCH codeword to which a new binary BCH codeword will be added; and multiplying s by a right submatrix Ũ of a matrix U, wherein U is an inverse of a parity matrix of a BCH code defined by t_(n), wherein the submatrix Ũ is of size mt₀×mΔt, wherein the new binary BCH codeword is y=Ũ·s.

According to a further embodiment of the disclosure, multiplying s by right submatrix Ũ of matrix U comprises multiplying each component of the syndrome vector s by a component of submatrix Ũ by a binary logic function in a single hardware cycle to yield a component product, wherein submatrix Ũ is calculated before receiving syndrome vector s of new binary BCH codeword y, and multiplexing the component products into a single output that represents the new binary BCH codeword y.

According to a further embodiment of the disclosure, the syndrome vector s is demultiplexed into separate Ũ matrices.

According to a further embodiment of the disclosure, multiplying s by right submatrix Ũ of matrix U further includes: multiplying each component of the syndrome vector s by a component of reduced submatrix Ũ′ by a binary logic function in a single hardware cycle to yield a component product, wherein reduced submatrix defined by Ũ′(x)=Ũ (x)/g₀ (x), wherein columns of submatrices Ũ′ and Ũ are represented as polynomials and each column of Ũ′(x) is the column of Ũ (x) divided by g₀ (x), are calculated before receiving syndrome vector s of new binary BCH codeword y; multiplexing the component products into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.

According to a further embodiment of the disclosure, the syndrome vector s is demultiplexed into separate Ũ′ matrices.

According to a further embodiment of the disclosure, convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.

According to a further embodiment of the disclosure, multiplying s by a right submatrix Ũ of a matrix U further includes: calculating a polynomial h_(j)(x) by multiplying syndrome vector s by a matrix H_(j) formed by concatenating the m polynomials h_(j,l) as columns, wherein polynomials

${h_{j,l}(x)} = {{{{{\overset{\_}{U}}_{{mj} + l}^{\prime}(x)}/{M_{j}(x)}}\mspace{14mu} {and}\mspace{14mu} {M_{j}(x)}} = {\prod\limits_{\underset{i \neq {j + t_{0} + 1}}{i = {t_{0} + 1}}}^{t_{1}}\; {m_{i}(x)}}}$

wherein m_(i)(x) is an i-th minimal polynomial of the BCH code C₁ with correction capability of t₁ and is calculated before receiving syndrome vector s of new binary BCH codeword y; multiplying h_(j)(x) by M_(j)(x), and summing over j=0 to Δt−1; multiplexing the sums of the products h_(j)(x)×M_(j)(x) into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.

According to a further embodiment of the disclosure, the syndrome vector s is demultiplexed into separate sets of H_(j) and M_(j).

According to a further embodiment of the disclosure, convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.

According to a further embodiment of the disclosure, the computer processor is one or more of an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and firmware.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts an exemplary encoding/decoding, according to embodiments of the disclosure.

FIG. 2 illustrates information data encoded using a BCH code to obtain a codeword, according to embodiments of the disclosure.

FIG. 3 illustrates exemplary GTP word and delta syndromes (DS) tables, according to embodiments of the disclosure.

FIG. 4 illustrates an exemplary hardware realization of the y-vec calculation, according to embodiments of the disclosure.

FIG. 5 illustrates an exemplary hardware realization of an embodiment of a y_vec calculation divided into two phases, according to embodiments of the disclosure.

FIG. 6 illustrates an exemplary hardware realization of an embodiment of a y_vec calculation divided into three phases, according to embodiments of the disclosure.

FIG. 7 is a block diagram of a system for efficiently encoding generalized tensor product codes, according to an embodiment of the disclosure.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

Exemplary embodiments of the disclosure as described herein generally provide systems and methods for efficiently encoding Generalized Tensor Product (GTP) codes. While embodiments are susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure.

A scheme according to an embodiment of the disclosure for encoding GTP codes concatenates several BCH code words (“frames”) with several different correction capabilities, into a single code word, referred to herein as a GTP codeword, where those frames are connected using several Reed Solomon code words to achieve mutual correction capability. A binary sequence added to each frame, i.e., a BCH single code word, to form the connection between the above mentioned BCH code words is referred to herein as a y_vec. In GF(2), the binary sequence can be XOR-ed to each frame.

For a linear code C of length n and dimension k over a finite field F, if H ∈ F^((n−k)×n) is a parity-check matrix of C and y ∈ F^(n) is some vector, then the syndrome of y with respect to H is defined as Hy ∈ F^(n−k). Write BCH(m, t) for the primitive binary BCH code of length 2^(m)−1 and designed correction radius t. When C is either BCH(m, t) or a shortened BCH(m, t), then typically n−k=mt, so that syndromes are typically of length mt. In fact, for BCH codes, it is useful to calculate syndromes with respect to a very particular parity check matrix. Let α ∈ GF(2^(m)) be a primitive element. Let C be as above, and let n be the length of C. For a vector y=(y₀, . . . , y_(n−1))^(T) ∈ GF(2)^(n), let y(X):=y₀+y₁X+ . . . +y_(n−1)X^(n−1) be the associated polynomial. For odd j ∈ {1, . . . , 2t−1}, define S_(j):=y(α^(j)) ∈ GF(2^(m)). The syndrome of y is defined as S=S^((t)):=(S₁, S₃, . . . , S_(2t−1))^(T). Note that y ∈ C if and only if S=0.

Suppose now that there are two correction radii, t₁<t₂. Then for any y, S^((t₁)) appears as the first t₁ entries of S^((t₂)). The remaining t₂−t₁ entries of S^((t₂)) will be referred to as the delta syndrome vector. Each entry of the delta syndrome vector is referred to as a (single) delta syndrome. Note that when y ∈ BCH(m, t₁), then the first t₁ entries of S^((t₂)) are zero. Note also that each element of GF(2^(m)), and in particular each entry of a syndrome, can be represented as an m-bit binary vector once a basis for GF(2^(m)) over GF(2) is fixed.
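The syndrome S^((t)) and the delta syndrome vector can be computed directly from these definitions. The following sketch assumes m=4 with GF(2^4) built from x^4+x+1 and represents each field element as a 4-bit integer in the polynomial basis; the field polynomial, the test vector, and the function names are illustrative assumptions.

```python
def gf16_exp_table(prim_poly: int = 0b10011):
    """Powers of a primitive element alpha of GF(2^4), as 4-bit integers."""
    exp = []
    value = 1
    for _ in range(15):
        exp.append(value)
        value <<= 1
        if value & 0b10000:
            value ^= prim_poly
    return exp


EXP = gf16_exp_table()


def syndrome(y_bits, t):
    """S^(t) = (y(alpha), y(alpha^3), ..., y(alpha^(2t-1)))."""
    result = []
    for j in range(1, 2 * t, 2):
        s_j = 0
        for i, bit in enumerate(y_bits):
            if bit:
                s_j ^= EXP[(i * j) % 15]   # add alpha^(i*j)
        result.append(s_j)
    return result


def delta_syndrome(y_bits, t1, t2):
    """Entries of S^(t2) that follow the first t1 entries."""
    return syndrome(y_bits, t2)[t1:]


# y(X) = 1 + X + X^4 is a codeword of BCH(4, 1) but not of BCH(4, 2).
y = [1, 1, 0, 0, 1] + [0] * 10
print(syndrome(y, 1), syndrome(y, 2), delta_syndrome(y, 1, 2))
```

For this y the first entry of S^((2)) is zero because y belongs to BCH(4, 1), while the single delta syndrome is nonzero.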

Each GTP frame is a BCH frame, and each may have a different correction capability. According to embodiments, t_(i) is the correction capability of a BCH frame belonging to stage number i, where stage defines a consecutive number of BCH frames with the same correction capability.

A GTP projected encoding scheme according to an embodiment of the disclosure may require that additional data, also referred to herein as projected data, will also be encoded within the BCH frames, along with the information and parity data. This is done by calculating a binary vector, herein called ‘y_vec’, upon the delta syndrome vector and XOR-ing it with the corresponding BCH frame. The trivial matrices used for the y_vec calculation y=U·s are large and require a large hardware area, where U will be defined below. Each internal ‘t’ requires its own matrix, so supporting more code rates and coding stages increases the complexity of an encoder.

According to embodiments, an added y_vec fulfills 3 conditions: (1) It is a BCH code word in the BCH code of the first stage; (2) the calculation of the delta syndrome of a y_vec yields a parity symbol of a Reed-Solomon (RS) code word calculated on the lower stages frames; and (3) y_vec is all zeros on the first k_(i) bits to maintain systematic structure of the BCH frame, k_(i) being the number of information bits of the i^(th) BCH frame.

FIG. 3 illustrates exemplary GTP word and delta syndromes (DS) tables, according to embodiments of the disclosure. In particular, the left table presents an example of a GTP code word, having 11 frames spanned over 4 stages. Starting from the top of the table, the first 3 frames belong to stage 0, thus each frame is an n-bit BCH code word with a correction capability of t₀ bits. Each BCH codeword at this stage comprises K₀ information bits 11 and L₀=mt₀ parity bits 12. For higher stages, the correction capability is higher and thus there are more parity bits and fewer information bits per frame. Thus, as illustrated in the table, stage 1 has K₁ information bits, and L₁ parity bits, stage 2 has K₂ information bits, and L₂ parity bits, and stage 3 has K₃ information bits, and L₃ parity bits. Note that K₀>K₁>K₂>K₃, L₀<L₁<L₂<L₃, and that K_(i)+L_(i)=n for i=0 to 3. The right table is a delta syndrome (DS) table for those 11 frames. In this example, there are 4 delta syndromes ΔS_(ji) for each frame, where j is the frame number and i is the delta syndrome number (indices start from zero). The DSs 13 are calculated over the frame, and the DSs 14 are “forced” DSs, calculated by RS (Reed Solomon) encoding. For example, for the first DS there are 3 DSs calculated on the first 3 frames, and those are used as the information of the RS code, which has 8 parity symbols (used as forced DSs for next frames).

According to embodiments, as described above, the input to the y_vec calculator module per frame is a plurality of DSs. A y_vec according to an embodiment can force the frame to which it is added, to be a code word in code C₀ with t₀ correction capability and to have its DSs comply with the corresponding parity symbols of the RS codes. For a simple structure in which there are only two stages, i.e., frames of code C₀ with correction capability t₀ and frames of code C₁ with correction capability t₁, Δt₁=t₁−t₀, this can be viewed as:

$\begin{matrix} {{{H_{1} \cdot \left( {\overset{\_}{x} + \overset{\_}{y}} \right)} = {{\begin{bmatrix} H_{0} \\ {\Delta \; H_{1}} \end{bmatrix} \cdot {y\_ vec}} = {\begin{pmatrix} 0 \\ \vdots \\ {\Delta \; s_{0}} \\ {\Delta \; s_{1}} \\ \vdots \\ {\Delta \; s_{{\Delta \; t_{1}} - 1}} \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \overset{\_}{s} \end{pmatrix}}}},} & (1) \end{matrix}$

where the number of rows in H₀ is m·t₀, the number of rows in ΔH₁ is m·Δt₁, s is a vector of forced delta syndromes, i.e., RS parity symbols, and each Δs_(j) has m bits. x is a code word in C₁ and thus has zero syndrome for the entire H matrix, and y is only in C₀ and thus has zero syndromes for H₀, but non-zero delta syndromes for ΔH₁. In addition, adding y to x does not change the systematic nature of x; that is, y must be all zeros, except for the last m·t₁ bits:

y=[0, . . . , 0, ỹ]^(T).

Embodiments of the disclosure can find a y that complies with the above formula. A method of calculating y can find a matrix U that fulfills:

$\begin{matrix} {U \cdot \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \bar{s} \end{pmatrix} = \tilde{y}, \quad \left\{ \begin{matrix} U\text{:} & mt_{1} \times mt_{1} \\ \bar{s}\text{:} & m\Delta t_{1} \times 1 \\ \tilde{y}\text{:} & mt_{1} \times 1 \end{matrix} \right.} & (2) \end{matrix}$

According to an embodiment, a matrix U can be found as follows. First, since a code according to an embodiment is cyclic, there exists a matrix U such that:

U·H ₁ =[A; I],   (3)

where I is the identity matrix of size mt₁×mt₁, A is a matrix of size mt₁×(n−mt₁), and the symbol “;” denotes concatenation. Multiplying both sides of EQ. (1) on the left by U produces the following:

${U \cdot H_{1} \cdot \overset{\_}{y}} = {U \cdot {\begin{pmatrix} 0 \\ \vdots \\ 0 \\ \overset{\_}{s} \end{pmatrix}.}}$

Hence:

$\left\lbrack A;I \right\rbrack \cdot \bar{y} = U \cdot \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \bar{s} \end{pmatrix}, \quad \text{and} \quad \left\lbrack A;I \right\rbrack \cdot \left\lbrack 0,\ldots,0,\tilde{y} \right\rbrack^{T} = U \cdot \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \bar{s} \end{pmatrix}.$

It can be seen that A multiplies only zeros, and thus its actual values do not matter, and this equation becomes:

${{I \cdot \overset{\sim}{y}} = {\overset{\sim}{y} = {U \cdot \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \overset{\_}{s} \end{pmatrix}}}},$

which is the desired result. Going back to EQ. (3), knowing that A is irrelevant, the equation can be formulated as:

U·H̃₁=I,   (4)

where H̃₁ is the rightmost mt₁×mt₁ submatrix of H₁. Thus, according to embodiments, U can be found by inverting the matrix H̃₁.

At this point, the matrix U is an mt₁×mt₁ binary matrix. According to embodiments, U can be calculated offline, and can be realized in hardware by a network of XOR gates; that is, each bit in the vector ỹ can be generated by XOR-ing several bits of s. However, U may be very large. Taking another look at EQ. (2), it can be seen that U is multiplying mt₀ zeros, thus there is no need to hold the full U matrix; it is enough to hold only a matrix Ũ of size mt₁×mΔt₁, which is the right side of U. From a hardware (HW) point of view, this is a reduction of complexity, or more explicitly, a reduction of hardware area (gate count) and power.

ỹ=Ũ·s.   (5)
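The derivation of EQS. (1) to (5) can be exercised end to end on a toy two-stage example. The sketch below assumes m=4, n=15, t₀=1, t₁=2, a field built from x^4+x+1, and a parity check matrix whose column i stacks the bits of α^(i) and α^(3i); these parameters and all names are illustrative assumptions, and the Gauss-Jordan inversion stands in for whatever offline procedure an actual design would use.

```python
M, N, T0, T1 = 4, 15, 1, 2   # assumed toy parameters


def gf16_exp_table(prim_poly: int = 0b10011):
    exp, value = [], 1
    for _ in range(15):
        exp.append(value)
        value <<= 1
        if value & 0b10000:
            value ^= prim_poly
    return exp


EXP = gf16_exp_table()


def bits(value, width=M):
    return [(value >> b) & 1 for b in range(width)]


def parity_check(t):
    """Binary H: column i stacks the bits of alpha^(i*j) for j = 1, 3, ..., 2t-1."""
    cols = []
    for i in range(N):
        col = []
        for j in range(1, 2 * t, 2):
            col += bits(EXP[(i * j) % 15])
        cols.append(col)
    return [[cols[i][r] for i in range(N)] for r in range(M * t)]


def gf2_inverse(A):
    """Gauss-Jordan inverse over GF(2); raises ValueError if A is singular."""
    n = len(A)
    aug = [row[:] + [int(r == c) for c in range(n)] for r, row in enumerate(A)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(A, v):
    return [sum(a & x for a, x in zip(row, v)) % 2 for row in A]


# Offline: invert the rightmost m*t1 x m*t1 submatrix of H1, keep the right columns.
H1 = parity_check(T1)
H1_right = [row[N - M * T1:] for row in H1]
U = gf2_inverse(H1_right)
U_tilde = [row[M * T0:] for row in U]          # size m*t1 x m*(t1 - t0)


def y_vec(s_bits):
    """Online: y = [0, ..., 0, U_tilde * s]."""
    return [0] * (N - M * T1) + matvec(U_tilde, s_bits)


s = bits(0b1010)
y = y_vec(s)
print(y, matvec(H1, y))   # the syndrome is [0, 0, 0, 0] followed by the bits of s
```

Only `U_tilde` needs to be stored; the left m·t₀ columns of `U` would multiply zeros and are discarded, which is the area saving described above.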

FIG. 4 illustrates an exemplary hardware realization of the y_vec calculation, ỹ=Ũ·s. For the example in FIG. 3, the first y_vec is calculated for the 4^(th) frame, and s=Δs_(3,0). FIG. 4 is a schematic illustration of a y_vec calculation block. The inputs are the forced DSs, denoted as S_vec, which are de-multiplexed by demultiplexer 22 to different multiplication modules 23 a, . . . , 23 n according to the current stage, and the outputs from the multiplication modules are multiplexed by multiplexer 24 back to a single output. Note that the S_vec is different for each frame, both by size and by data. Each multiplication module 23 a, . . . , 23 n is a logic function realizing one component of the matrix multiplication.

According to embodiments, the complexity of EQ. (5) can be further reduced. Since, from EQ. (4), U·H̃₁=I, with H̃₁ the rightmost mt₁×mt₁ submatrix of H₁, it can be shown that the following is also valid:

H̃₁·U=I.

Since H₁ is the syndrome matrix, it can be viewed as “checking” the (zero-padded) columns of U to be code words in code C₁. Looking at a single column of U as the coefficients of a polynomial, such a polynomial would be a code word in BCH code C₁ if it is a multiple of all minimal polynomials generating this code. But, since the multiplication H̃₁·U, with H̃₁ the rightmost mt₁×mt₁ submatrix of H₁, does not result in a zero matrix (there is a single ‘1’ in each result column), none of the columns of U is a code word in C₁. Nevertheless, since every m rows of H₁ check if a polynomial is divisible by a specific minimal polynomial, it means that every column of U, treated as a polynomial, is divisible by all minimal polynomials generating the code except a single minimal polynomial. When looking at the multiplication of H̃₁ by Ũ, there is the same effect, except the result is a matrix of the form:

H̃₁·Ũ=[0; I]^(T).

The zero matrix is of size mt₀×mΔt₁, which means that all columns of Ũ, treated as polynomials, are multiples of all minimal polynomials up to t₀, and of all higher minimal polynomials but one. According to embodiments, the multiplier of all minimal polynomials up to t₀ will be referred to as g₀(x), which is a common multiplier of all the columns of Ũ. From EQ. (5):

ỹ=Ũ·s=Σ_(k=0)^(mΔt₁−1) s_(k)Ũ_(k),

where Ũ_(k) is the k-th column of Ũ, or in a polynomial representation:

ỹ(x)=Σ_(k=0)^(mΔt₁−1) s_(k)Ũ_(k)(x)=g₀(x)·Σ_(k=0)^(mΔt₁−1) s_(k)Ũ′_(k)(x),   (6)

where:

Ũ′_(k)(x)=Ũ_(k)(x)/g₀(x),   (7)

and as described above, Ũ_(k) (x) is divisible by g₀ (x).

So, according to embodiments, the matrix Ũ can be replaced with a new matrix, Ũ′, of size mΔt₁×mΔt₁, where its columns can be calculated offline using EQ. (7). This matrix has a much reduced complexity relative to Ũ. To calculate ỹ, another multiplication, i.e., a polynomial coefficients convolution, is performed after the multiplication:

ỹ(x)=(Ũ′·s)(x)*g₀(x).

The additional convolution can be performed with a very low complexity by multi cycle implementation, thus its added complexity is negligible.

Thus, according to embodiments, the result y=U·s can be achieved in two phases: (1) tmp=Ũ′·s; and (2) y=conv (tmp, g_(t₀)(x)). This result has the following properties: (1) the Ũ′ matrices are significantly smaller than the original U matrices; (2) g_(t₀)(x) is a polynomial common to all Ũ′ matrices; and (3) multiplication by g_(t₀)(x) can also be realized as a matrix, i.e., an XOR gates network. Moreover, multiplication by g_(t₀)(x) can be realized in a multi-cycle format, which enables reuse of a small HW over multiple clock periods, thus further decreasing the matrix size.
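The equivalence of the single-phase and two-phase calculations can be checked on a toy case. The sketch below assumes m=4, n=15, t₀=1, t₁=2, and g₀(x)=x^4+x+1, and it finds the columns of the right submatrix by brute-force search over all 8-bit tails rather than by matrix inversion; that search, the parameters, and every name are assumptions made to keep the example short. Polynomials are stored as integers whose bit i is the coefficient of x^i.

```python
N, M, T1 = 15, 4, 2
OFFSET = N - M * T1          # number of leading zero positions in y
G0 = 0b10011                 # assumed g0(x) = x^4 + x + 1


def gf16_exp_table(prim_poly: int = 0b10011):
    exp, value = [], 1
    for _ in range(15):
        exp.append(value)
        value <<= 1
        if value & 0b10000:
            value ^= prim_poly
    return exp


EXP = gf16_exp_table()


def eval_poly(y: int, j: int) -> int:
    """y(alpha^j) for a GF(2) polynomial stored as a bitmask."""
    acc, i = 0, 0
    while y >> i:
        if (y >> i) & 1:
            acc ^= EXP[(i * j) % 15]
        i += 1
    return acc


def gf2_mul(a: int, b: int) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(a: int, b: int):
    q, db = 0, b.bit_length()
    while a.bit_length() >= db:
        shift = a.bit_length() - db
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def u_tilde_column(k: int) -> int:
    """Tail whose zero-padded word has S1 = 0 and delta syndrome S3 = bit k."""
    for tail in range(1 << (M * T1)):
        y = tail << OFFSET
        if eval_poly(y, 1) == 0 and eval_poly(y, 3) == (1 << k):
            return tail
    raise ValueError("no column found")


# Offline: columns of the right submatrix and their quotients by g0(x), as in EQ. (7).
U_COLS = [u_tilde_column(k) for k in range(M)]
U_PRIME = []
for col in U_COLS:
    quotient, remainder = gf2_divmod(col, G0)
    assert remainder == 0          # every column is divisible by g0(x)
    U_PRIME.append(quotient)


def xor_select(columns, s: int) -> int:
    acc = 0
    for k, col in enumerate(columns):
        if (s >> k) & 1:
            acc ^= col
    return acc


def y_direct(s: int) -> int:
    return xor_select(U_COLS, s)                  # single-phase product


def y_two_phase(s: int) -> int:
    tmp = xor_select(U_PRIME, s)                  # phase 1: small matrix
    return gf2_mul(tmp, G0)                       # phase 2: convolution with g0


print([bin(c) for c in U_COLS], [bin(c) for c in U_PRIME])
```

In this toy case each stored column shrinks from 8 bits to 4 bits, at the price of one extra polynomial multiplication by the shared g₀(x).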

FIG. 5 illustrates an exemplary hardware realization of an embodiment of a y_vec calculation divided into two phases as described above. The hardware configuration of FIG. 5 is substantially similar to that of FIG. 4, and thus only differences between the configurations will be described. The output of the demultiplexer 24 is tmp, and a convolution y=conv(tmp, g_(t₀)(x)) according to an embodiment is implemented by the block 35. A convolution according to an embodiment is a finite impulse response (FIR) filter, and thus can be performed in several clock cycles using a state (memory) register, instead of in a single clock cycle, which enables using a smaller logic function.
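By way of a hedged software analogy, a convolution over GF(2) realized as an FIR filter with a state register can be modeled as below. This is a minimal sketch and not a description of block 35: the function name `fir_conv_gf2`, the one-bit-per-cycle schedule, and the LSB-first coefficient lists are all assumptions made for illustration.

```python
def fir_conv_gf2(tmp_bits: list[int], g_taps: list[int]) -> list[int]:
    """Bit-serial GF(2) convolution; each loop iteration models one clock cycle.

    Both arguments are coefficient lists with the x^0 coefficient first.
    """
    state = [0] * (len(g_taps) - 1)      # delay line (state register)
    out = []
    # Feed the input, then zeros to flush the remaining state.
    for bit in list(tmp_bits) + [0] * (len(g_taps) - 1):
        window = [bit] + state           # current input followed by past inputs
        y = 0
        for tap, w in zip(g_taps, window):
            y ^= tap & w                 # AND for multiplication, XOR for addition
        out.append(y)
        state = window[:-1]              # shift the register by one position
    return out


# Hypothetical example: (x + 1) convolved with (x^3 + x + 1).
product = fir_conv_gf2([1, 1], [1, 1, 0, 1])
```

In this model only one tap-wide XOR network is evaluated per cycle, which reflects the trade of latency for a smaller logic function mentioned above.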

Embodiments of the disclosure utilize only the zero part of

·Ũ. However, as described above, since the result is of the form [0; I]^(T), there exist more common characteristics of the result columns: every batch of m columns is a multiple of g₀(x), as above, but also of all other minimal polynomials, except one. According to embodiments, this can be formulated as follows:

$\mspace{20mu} {{{\overset{\sim}{y}(x)} = {\sum\limits_{k = 0}^{{m\; \Delta \; t_{1}} - 1}{s_{k}{{\overset{\_}{\overset{\sim}{U}}}_{k}(x)}}}},{= {{{g_{0}(x)} \cdot {\sum\limits_{j = 0}^{{\Delta \; t_{1}} - 1}{\sum\limits_{l = 0}^{m - 1}{s_{j,l} \cdot {M_{j}(x)} \cdot {h_{j,l}(x)}}}}} = {{g_{0}(x)} \cdot {\sum\limits_{j = 0}^{{\Delta \; t_{1}} - 1}{{M_{j}(x)}{\sum\limits_{l = 0}^{m - 1}{s_{j,l} \cdot {h_{j,l}(x)}}}}}}}},}$

where s_(j,l) is defined as s_(mj+l), the l-th bit in the j-th delta syndrome,

$M_j(x) = \prod_{\substack{i = t_0 + 1 \\ i \neq j + t_0 + 1}}^{t_1} m_i(x),$

m_(i)(x) is the i-th minimal polynomial and h_(j,l)(x)=Ū′_(mj+l)(x)/M_(j)(x). As before, all of the polynomials h_(j,l)(x) and M_(j)(x) can be calculated offline.

According to embodiments, by defining an m×m matrix H_(j) to be the concatenation of the m polynomials h_(j,l), l=0, . . . , m−1, as columns, it can be seen that:

$\sum_{l=0}^{m-1} s_{j,l} \cdot h_{j,l}(x) = H_j \cdot s_j = h_j(x),$

and a final formula according to an embodiment is obtained:

$\tilde{y}(x) = g_0(x) \cdot \sum_{j=0}^{\Delta t_1 - 1} M_j(x) \cdot h_j(x).$

According to another embodiment of the disclosure, a calculation has been divided into three phases:

-   1. Calculating h_(j)(x),
-   2. Multiplying by M_(j)(x) and summing,
-   3. Multiplying by g₀(x).
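The three phases listed above can be sketched in software as follows. This is an illustrative sketch under stated assumptions: binary polynomials are stored as Python integers (bit i holds the coefficient of x^(i)), and the values of `g0`, `M`, `H`, and `s` are hypothetical toy data rather than polynomials derived from an actual BCH code. The final loop is only a reference check that the three-phase result matches the direct sum over all columns.

```python
def gf2_mul(a: int, b: int) -> int:
    """Product of two binary polynomials (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


# Hypothetical toy data (assumptions): m = 3 bits per delta syndrome, two delta syndromes.
g0 = 0b111                                  # stands in for g0(x)
M = [0b1101, 0b1011]                        # stands in for M_0(x), M_1(x)
H = [[0b001, 0b010, 0b111],                 # H[j][l] stands in for h_{j,l}(x)
     [0b011, 0b100, 0b110]]
s = [[1, 0, 1], [0, 1, 1]]                  # s[j][l] is bit l of delta syndrome j

# Phase 1: h_j(x) = H_j . s_j, an XOR of the columns selected by the syndrome bits.
h = []
for Hj, sj in zip(H, s):
    hj = 0
    for col, bit in zip(Hj, sj):
        if bit:
            hj ^= col
    h.append(hj)

# Phase 2: multiply each h_j(x) by M_j(x) and sum over j.
acc = 0
for Mj, hj in zip(M, h):
    acc ^= gf2_mul(Mj, hj)

# Phase 3: multiply by the common polynomial g0(x).
y = gf2_mul(acc, g0)

# Reference: direct sum of s_{j,l} * g0(x) * M_j(x) * h_{j,l}(x).
y_ref = 0
for j in range(len(M)):
    for l in range(len(H[j])):
        if s[j][l]:
            y_ref ^= gf2_mul(g0, gf2_mul(M[j], H[j][l]))

assert y == y_ref
```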

In a first section, each delta syndrome is multiplied by its own offline calculated matrix H_(j). A third section is exactly as described in an embodiment as illustrated in FIG. 5. A second section according to another embodiment combines both multiplication and summation. Note that each hardware stage requires its own H and M matrices, for all delta syndromes that participate in that stage. That is, there is a different hardware for DSI when it is used in stage i and j. The multiplication of two polynomials represented by coefficient vectors can be implemented in a multi cycle operation as follows:

$h(x) = h_{n-1} x^{n-1} + h_{n-2} x^{n-2} + \ldots + h_1 x + h_0,$

and

$h(x) \cdot M(x) = h_0 \cdot M(x) + x \cdot h_1 \cdot M(x) + \ldots + h_{n-1} \cdot x^{n-1} \cdot M(x).$

Since, for binary polynomials, multiplication by x^(j) is a shift left by j, the multiplication of the two polynomials can be realized as adding each bit of one polynomial multiplied by the other polynomial with the corresponding shift. Thus, if the coefficient of x^(m−1) (MSB) of h_(j)(x) is multiplied by M_(j)(x) in the first cycle, and it is added to the bit that was right-shifted by one bit in the multiplication of the coefficient of x^(m−2) of h_(j)(x) by M_(j)(x) in the second cycle, and so on, the result of M_(j)(x)·h_(j)(x) is obtained. Since a sum over j is desired, the same can be performed in a matrix format: define a matrix G which is a concatenation of all M_(j) as columns, and in each cycle i, over m cycles, multiply G by a vector v_(i)=[h₀^((i)), h₁^((i)), . . . , h_(Δt₂−1)^((i))]^(T) and add it to the result shifted by i bits to the right, where h_(j)^((i)) is defined as the coefficient of x^(i) of h_(j)(x).
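A minimal software sketch of such a multi-cycle multiply-and-sum is given below. It is one possible schedule, chosen as an assumption for illustration: polynomials are stored as Python integers with bit i holding the coefficient of x^(i), the cycles run from the MSB down to the LSB, and the running result is shifted by one position per cycle (a Horner-style accumulation), so the shift direction depends on this storage convention. The function name `multicycle_mul_sum` and the toy values are hypothetical.

```python
def multicycle_mul_sum(M: list[int], h: list[int], m: int) -> int:
    """Compute the XOR-sum over j of M[j](x) * h[j](x) in m simulated cycles.

    Each h[j] is assumed to have degree below m.
    """
    acc = 0
    for i in reversed(range(m)):              # one iteration per clock cycle, MSB first
        v = [(hj >> i) & 1 for hj in h]       # v_i: coefficient of x^i of every h_j
        gv = 0
        for Mj, bit in zip(M, v):             # G . v_i, with the M_j as columns of G
            if bit:
                gv ^= Mj
        acc = (acc << 1) ^ gv                 # shift the running result, add this cycle
    return acc


# Hypothetical toy values: M_0 = x^3+x^2+1, M_1 = x^3+x+1, h_0 = x^2+x, h_1 = x.
result = multicycle_mul_sum([0b1101, 0b1011], [0b110, 0b010], 3)
```

In each cycle only one bit from each h_(j) register is consumed, so the logic evaluated per cycle is a single pass through G rather than a full polynomial multiplier.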

FIG. 6 illustrates an exemplary hardware realization of an embodiment of a y_vec calculation divided into three phases as described above. However, embodiments of the disclosure can encompass any number of phases. In the figure, N is the number of stages, m is the exponent of the field GF(2^(m)) of the binary BCH code, and dt_(n)=t_(n)−t₀. In addition, for clarity, only blocks on the left-hand side have reference numbers. The figure depicts a HW configuration where multiplication by G is done in m cycles, where the input for each cycle is one bit from each register, and the bits are chosen according to the cycle number. For each cycle, the output bits are summed with one shift relative to a previous cycle. Referring to the figure, blocks 41 represent the components of each of dt_(N−1) syndrome vectors, blocks 42 represent the sums Σ_(l=0)^(m−1) s_(j,l)·h_(j,l)(x)=H_(j)·s_(j)=h_(j)(x), whose output is stored in registers 43, blocks 44 and represent the sums Σ_(j=0)^(Δt₁−1) M_(j)(x)·h_(j)(x) with the m shifts, as discussed above, whose results are summed and stored in registers 46, block 47 is a multiplexer, and block 48 represents the final multiplication by g₀(x), which, as described above, may be done in a multi-cycle operation.

System Implementations

It is to be understood that embodiments of the present disclosure can be implemented in various forms of hardware, software, firmware, special purpose processes, or a combination thereof. In one embodiment, the present disclosure can be implemented in hardware as an application-specific integrated circuit (ASIC), or as a field programmable gate array (FPGA). In another embodiment, the present disclosure can be implemented in software as an application program tangibly embodied on a computer readable program storage device. The application program can be uploaded to, and executed by, a machine comprising any suitable architecture.

FIG. 7 is a block diagram of a system for efficiently encoding generalized tensor product codes, according to an embodiment of the disclosure. Referring now to FIG. 7, a computer system 51 for implementing an embodiment of the present disclosure can comprise, inter alia, a central processing unit (CPU) 52, a memory 53 and an input/output (I/O) interface 54. The computer system 51 is generally coupled through the I/O interface 54 to a display 55 and various input devices 56 such as a mouse and a keyboard. The support circuits can include circuits such as cache, power supplies, clock circuits, and a communication bus. The memory 53 can include random access memory (RAM), read only memory (ROM), disk drive, tape drive, etc., or a combination thereof. The present disclosure can be implemented as a routine 57 that is stored in memory 53 and executed by the CPU 52 to process the signal from the signal source 58. As such, the computer system 51 is a general purpose computer system that becomes a specific purpose computer system when executing the routine 57 of the present disclosure. Alternatively, as described above, embodiments of the present disclosure can be implemented as an ASIC or FPGA 57 that is in signal communication with the CPU 52 to process the signal from the signal source 58.

The computer system 51 also includes an operating system and micro instruction code. The various processes and functions described herein can either be part of the micro instruction code or part of the application program (or combination thereof) which is executed via the operating system. In addition, various other peripheral devices can be connected to the computer platform such as an additional data storage device and a printing device.

It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures can be implemented in software, the actual connections between the systems components (or the process steps) may differ depending upon the manner in which an embodiment of the present disclosure is programmed. Given the teachings of the present disclosure provided herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present disclosure.

While embodiments of the present disclosure have been described in detail with reference to exemplary embodiments, those skilled in the art will appreciate that various modifications and substitutions can be made thereto without departing from the spirit and scope of the disclosure as set forth in the appended claims.

1. A computer implemented method for generating a binary Generalized Tensor Product (GTP) codeword, comprised of N structure stages wherein N is an integer greater than 1 and each stage is comprised of at least one BCH codeword with error correction capability greater than a prior stage and smaller than a next stage, the method executed by the computer comprising the steps of: receiving new stage 0 binary BCH codeword v over a field GF(2^(m)) from a communication channel; receiving a syndrome vector s of the new stage 0 binary BCH codeword y that comprises Δt syndromes of length m bits, wherein Δt=t_(n)−t₀, t₀ is the error correction capability of the stage 0 BCH codeword, t_(n) is an error correction capability of a stage n BCH codeword to which a new binary BCH codeword y will be added, wherein the syndrome vector s comprises l-th Reed-Solomon (RS) symbols of Δt RS codewords whose information symbols are delta syndromes of all BCH codewords from stage 0 until stage n−1, wherein l indexes the BCH codeword to which y will be added; and multiplying s by a right submatrix Ũ of a matrix U, wherein U is an inverse of a parity matrix of a BCH code defined by t_(n), wherein the submatrix Ũ is of size mt₀×mΔt, wherein the new binary BCH codeword is y=Ũ·s.
 2. The method of claim 1, wherein multiplying s by right submatrix Ũ of matrix U comprises multiplying each component of the syndrome vector s by a component of submatrix Ũ by a binary logic function in a single hardware cycle to yield a component product, wherein submatrix Ũ is calculated before receiving syndrome vector s of new binary BCH codeword y, and multiplexing the component products into a single output that represents the new binary BCH codeword y.
 3. The method of claim 2, wherein the syndrome vector s is demultiplexed into separate Ũ matrices.
 4. The method of claim 1, wherein multiplying s by right submatrix Ũ of matrix U further comprises: multiplying each component of the syndrome vector s by a component of reduced submatrix Ũ′ by a binary logic function in a single hardware cycle to yield a component product, wherein reduced submatrix defined by Ũ′(x)=Ũ(x)/g₀(x), wherein columns of submatrices Ũ′ and Ũ are represented as polynomials and each column of Ũ′(x) is the column of Ũ(x) divided by g₀(x), are calculated before receiving syndrome vector s of new binary BCH codeword y; multiplexing the component products into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.
 5. The method of claim 4, wherein the syndrome vector s is demultiplexed into separate Ũ′ matrices.
 6. The method of claim 4, wherein convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.
 7. The method of claim 1, wherein multiplying s by a right submatrix Ũ of a matrix U further comprises: calculating a polynomial h_(j)(x) by multiplying syndrome vector s by a matrix formed by concatenating the m polynomials h_(j,l) as columns, wherein polynomials $h_{j,l}(x) = \bar{U}'_{mj+l}(x) / M_j(x)$ and $M_j(x) = \prod_{\substack{i = t_0 + 1 \\ i \neq j + t_0 + 1}}^{t_1} m_i(x)$ wherein m_(i)(x) is an i-th minimal polynomial of the BCH code C₁ with correction capability of t₁ and is calculated before receiving syndrome vector s of new binary BCH codeword y; multiplying h_(j)(x) by M_(j)(x), and summing over j=0 to Δt−1; multiplexing the sums of the products h_(j)(x)×M_(j)(x) into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.
 8. The method of claim 7, wherein the syndrome vector s is demultiplexed into separate sets of H_(j) and M_(j).
 9. The method of claim 7, wherein convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.
 10. A computer processor configured to execute a program of instructions to perform the method steps generating a binary Generalized Tensor Product (GTP) codeword, comprised of N structure stages wherein N is an integer greater than 1 and each stage is comprised of at least one BCH codeword with error correction capability greater than a prior stage and smaller than a next stage, the method comprising the steps of: receiving new stage 0 binary BCH codeword v over a field GF(2^(m)) from a communication channel; receiving a syndrome vector s of the new stage 0 binary BCH codeword y that comprises Δt syndromes of length m bits, wherein Δt=t_(n)−t₀, t₀ is the error correction capability of the stage 0 BCH codeword, t_(n) is an error correction capability of a stage n BCH codeword to which a new binary BCH codeword y will be added, wherein the syndrome vector s comprises l-th Reed-Solomon (RS) symbols of Δt RS codewords whose information symbols are delta syndromes of all BCH codewords from stage 0 until stage n−1, wherein l indexes the BCH codeword to which y will be added; and multiplying s by a right submatrix Ũ of a matrix U, wherein U is an inverse of a parity matrix of a BCH code defined by t_(n), wherein the submatrix Ũ is of size mt₀×mΔt, wherein the new binary BCH codeword is y=Ũ·s.
 11. The computer processor of claim 10, wherein multiplying s by right submatrix Ũ of matrix U comprises multiplying each component of the syndrome vector s by a component of submatrix Ũ by a binary logic function in a single hardware cycle to yield a component product, wherein submatrix Ũ is calculated before receiving syndrome vector s of new binary BCH codeword y, and multiplexing the component products into a single output that represents the new binary BCH codeword y.
 12. The computer processor of claim 11, wherein the syndrome vector s is demultiplexed into separate Ũ matrices.
 13. The computer processor of claim 10, wherein multiplying s by right submatrix Ũ of matrix U further comprises: multiplying each component of the syndrome vector s by a component of reduced submatrix Ũ′ by a binary logic function in a single hardware cycle to yield a component product, wherein reduced submatrix defined by Ũ′(x)=Ũ(x)/g₀(x), wherein columns of submatrices Ũ′ and Ũ are represented as polynomials and each column of Ũ′ is the column of Ũ(x) divided by g₀(x), are calculated before receiving syndrome vector s of new binary BCH codeword y; multiplexing the component products into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.
 14. The computer processor of claim 13, wherein the syndrome vector s is demultiplexed into separate Ũ′ matrices.
 15. The computer processor of claim 13, wherein convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.
 16. The computer processor of claim 10, wherein multiplying s by a right submatrix Ũ of a matrix U further comprises: calculating a polynomial h_(j)(x) by multiplying syndrome vector s by a matrix H_(j) formed by concatenating the m polynomials h_(j,l) as columns, wherein polynomials $h_{j,l}(x) = \bar{U}'_{mj+l}(x) / M_j(x)$ and $M_j(x) = \prod_{\substack{i = t_0 + 1 \\ i \neq j + t_0 + 1}}^{t_1} m_i(x)$ wherein m_(i)(x) is an i-th minimal polynomial of the BCH code C₁ with correction capability of t₁ and is calculated before receiving syndrome vector s of new binary BCH codeword y; multiplying h_(j)(x) by M_(j)(x), and summing over j=0 to Δt−1; multiplexing the sums of the products h_(j)(x)×M_(j)(x) into a temporary output; and convolving the temporary output with a common multiplier g₀(x) to yield the single output that represents the new binary BCH codeword y, wherein g₀(x) is a common multiplier of all columns of submatrix Ũ represented as polynomials and is calculated before receiving syndrome vector s of new binary BCH codeword y.
 17. The computer processor of claim 16, wherein the syndrome vector s is demultiplexed into separate sets of H_(j) and M_(j).
 18. The computer processor of claim 16, wherein convolving the temporary output with a common multiplier g₀(x) is performed over multiple clock cycles.
 19. The computer processor of claim 10, wherein the computer processor is one or more of an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and firmware. 